228 research outputs found

    Identifying Causal Relations in Legal Documents with Dependency Syntactic Analysis

    This article describes a method for enriching a dependency-based parser with causal connectors. Our specific objective is to identify causal relationships between elementary discourse units in Spanish legal texts. To this end, we search for specific discourse connectives, which are treated as causal dependencies relating an effect event (the head) to a verbal or nominal cause (the dependent). As a result, we turn a syntactic parser into a discourse parser aimed at recognizing causal structures.
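
    The connective-based idea in this abstract can be illustrated with a minimal sketch. It is a plain string-matching stand-in, not the authors' dependency parser: the connective list, the `CausalRelation` type, and the `find_causal_relation` function are all hypothetical names chosen for illustration, and the sketch assumes the effect precedes the connective and the cause follows it.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical, non-exhaustive list of Spanish causal connectives.
# The abstract does not say which connectives the parser actually uses.
CAUSAL_CONNECTIVES = ["debido a", "a causa de", "porque", "ya que"]


class CausalRelation(NamedTuple):
    effect: str      # plays the role of the head in the abstract's terms
    connective: str  # the causal marker that was matched
    cause: str       # plays the role of the dependent


_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CAUSAL_CONNECTIVES) + r")\b",
    re.IGNORECASE,
)


def find_causal_relation(sentence: str) -> Optional[CausalRelation]:
    """Split a sentence at the first causal connective, if there is one.

    Simplifying assumption: the text before the connective is the effect
    and the text after it is the cause.
    """
    match = _PATTERN.search(sentence)
    if match is None:
        return None
    effect = sentence[: match.start()].strip(" ,;")
    cause = sentence[match.end():].strip(" ,;.")
    if not effect or not cause:
        return None
    return CausalRelation(effect, match.group(1).lower(), cause)


if __name__ == "__main__":
    print(find_causal_relation("El contrato fue anulado debido a un error formal."))
```

    A real dependency-based implementation would attach the cause to the effect's head word in the parse tree rather than splitting the surface string, which is what lets it handle connectives that appear sentence-initially or inside nested clauses.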

    LeMe-PT: A Medical Package Leaflet Corpus for Portuguese

    The current trend in natural language processing is the use of machine learning. This is being done in every field, from summarization to machine translation. For these techniques to be applied, resources are needed, namely quality corpora. While there are large quantities of corpora for the Portuguese language, there is a lack of technical and focused corpora. Therefore, in this article we present a new corpus, built from drug package leaflets. We describe its structure and contents, and discuss possible exploration directions.

    Evaluation of Distributional Models with the Outlier Detection Task

    In this article, we define the outlier detection task and use it to compare neural-based word embeddings with transparent count-based distributional representations. Using the English Wikipedia as the text source to train the models, we observed that embeddings outperform count-based representations when their contexts are made up of bag-of-words. However, there are no sharp differences between the two models if the word contexts are defined as syntactic dependencies. In general, syntax-based models tend to perform better than those based on bag-of-words for this specific task. Similar experiments were carried out for Portuguese with similar results. The test datasets we have created for the outlier detection task in English and Portuguese are released.
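
    The abstract does not spell out how an outlier is scored, so the following is only a sketch of one common formulation: each word in a set receives a compactness score (its average cosine similarity to the other words), and the word with the lowest score is predicted to be the outlier. The toy vectors and the function names `cosine` and `detect_outlier` are assumptions made for illustration, not data or code from the article.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def detect_outlier(words: List[str], vectors: Dict[str, Sequence[float]]) -> str:
    """Return the word with the lowest average similarity to the others."""
    def compactness(word: str) -> float:
        others = [o for o in words if o != word]
        return sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)

    return min(words, key=compactness)


# Hypothetical toy vectors: three animals and one vehicle.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "horse": [0.7, 0.3, 0.0],
    "car": [0.1, 0.1, 0.9],
}

if __name__ == "__main__":
    print(detect_outlier(["cat", "dog", "horse", "car"], TOY_VECTORS))  # car
```

    Because the procedure only needs a similarity function over word representations, the same test set can be applied to embeddings and to count-based vectors alike, which is what makes the task usable for the comparison the abstract describes.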

    A Proposal for a Semantics of Syntactic Dependencies (Propuesta para una semántica de las dependencias sintácticas)

    The main goal of this article is to propose a formal model of the process of semantic interpretation of syntactic dependencies. We define a syntactic dependency as a binary operation that takes as arguments the denotations of two related words (head and modifier) and returns a reordering of their denotations. We assume that this binary operation plays an essential role in the process of semantic interpretation.

    The Meaning of Syntactic Dependencies

    This paper discusses the semantic content of syntactic dependencies. We assume that syntactic dependencies play a central role in the process of semantic interpretation. They are defined as selective functions on word denotations. Among their properties, special attention is paid to their ability to make interpretation co-compositional and incremental. To describe the semantic properties of dependencies, the paper focuses on two particular linguistic tasks: word sense disambiguation and attachment resolution. The second task is performed using a strategy based on automatic acquisition from corpora.

    Compositional Distributional Semantics with Syntactic Dependencies and Selectional Preferences

    This article describes a compositional model based on syntactic dependencies, designed to build contextualized word vectors by following linguistic principles related to the concept of selectional preferences. The proposed compositional strategy has been evaluated on a syntactically controlled, multilingual dataset and compared with BERT-like Transformer models, such as Sentence BERT, the state of the art in sentence similarity. For this purpose, we created two new test datasets for Portuguese and Spanish on the basis of the one defined for English, containing expressions with noun-verb-noun transitive constructions. The results show that the linguistically based compositional approach is competitive with Transformer models. This work has received financial support from DOMINO project (PGC2018-102041-B-I00, MCIU/AEI/FEDER, UE), eRisk project (RTI2018-093336-B-C21), the Consellería de Cultura, Educación e Ordenación Universitaria (accreditation 2016-2019, ED431G/08, Groups of Reference: ED431C 2020/21, and ERDF 2014-2020: Call ED431G 2019/04) and the European Regional Development Fund (ERDF).

    Using the Outlier Detection Task to Evaluate Distributional Semantic Models

    In this article, we define the outlier detection task and use it to compare neural-based word embeddings with transparent count-based distributional representations. Using the English Wikipedia as a text source to train the models, we observed that embeddings outperform count-based representations when their contexts are made up of bag-of-words. However, there are no sharp differences between the two models if the word contexts are defined as syntactic dependencies. In general, syntax-based models tend to perform better than those based on bag-of-words for this specific task. Similar experiments were carried out for Portuguese with similar results. The test datasets we have created for the outlier detection task in English and Portuguese are freely available. This work was supported by a 2016 BBVA Foundation Grant for Researchers and Cultural Creators and by Project TELEPARES, Ministry of Economy and Competitiveness (FFI2014-51978-C2-1-R). It has received financial support from the Consellería de Cultura, Educación e Ordenación Universitaria (accreditation 2016–2019, ED431G/08) and the European Regional Development Fund (ERDF).

    The role of syntactic dependencies in compositional distributional semantics

    This article provides a preliminary semantic framework for Dependency Grammar in which lexical words are semantically defined as contextual distributions (sets of contexts), while syntactic dependencies are compositional operations on word distributions. More precisely, any syntactic dependency uses the contextual distribution of the dependent word to restrict the distribution of the head, and uses the contextual distribution of the head to restrict that of the dependent word. The interpretation of composite expressions and sentences, which are analyzed as a tree of binary dependencies, is performed by restricting the contexts of words dependency by dependency in a left-to-right incremental way. Consequently, the meaning of the whole composite expression or sentence is not a single representation, but a list of contextualized senses, namely the restricted distributions of its constituent (lexical) words. We report the results of two large-scale corpus-based experiments on two different natural language processing applications: paraphrasing and compositional translation. This work is funded by Project TELEPARES, Ministry of Economy and Competitiveness (FFI2014-51978-C2-1-R), and the program “Ayuda Fundación BBVA a Investigadores y Creadores Culturales 2016”.

    Entity linking with distributional semantics

    Entity Linking (EL) consists of linking name mentions in a given text with the entities they refer to in external knowledge bases such as DBpedia/Wikipedia. In this paper, we propose an EL approach whose main contribution is the use of a knowledge base built by means of distributional similarity. More precisely, Wikipedia is transformed into a manageable database structured with similarity relations between entities. Our EL method focuses on a specific task, namely the semantic annotation of documents by extracting the relevant terms that are linked to nodes in DBpedia/Wikipedia. The method currently works for four languages. The Portuguese and English versions have been evaluated and compared against other EL systems, showing competitive performance, close to that of the best systems. Funding: Ministerio de Economía y Competitividad; FFI2014-51978-C2-1-